Efficient iterative method for solving the 
Dirac-Kohn-Sham density functional theory 

Lin Lin a , Sihong Shao b '*, Weinan E c 

a Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, 

CA 94720, USA 

b LMAM and School of Mathematical Sciences, Peking University, Beijing 100871, China 
c Department of Mathematics and P ACM, Princeton University, Princeton, NJ 08544, 
USA; Beijing International Center for Mathematical Research, Peking University, 
'<-] '. Beijing 100871, China 

Abstract 



We present for the first time an efficient iterative method to directly solve the 
four- component Dirac-Kohn-Sham (DKS) density functional theory. Due to 
the existence of the negative energy continuum in the DKS operator, the ex- 
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isting iterative techniques for solving the Kohn-Sham systems cannot be effi- 
ciently applied to solve the DKS systems. The key component of our method 
is a novel filtering step (F) which acts as a preconditioner in the framework 
of the locally optimal block preconditioned conjugate gradient (LOBPCG) 
method. The resulting method, dubbed the LOBPCG-F method, is able to 
compute the desired eigenvalues and eigenvectors in the positive energy band 
without computing any state in the negative energy band. The LOBPCG- 
F method introduces mild extra cost compared to the standard LOBPCG 
method and can be easily implemented. We demonstrate our method in the 
pseudopotential framework with a planewave basis set which naturally sat- 
isfies the kinetic balance prescription. Numerical results for Pt2, Au2, TIF, 
and Bi 2 Se3 indicate that the LOBPCG-F method is a robust and efficient 
method for investigating the relativistic effect in systems containing heavy 
elements. 
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1. Introduction 

The electron, as an elementary particle, has spin and charge, and further 
acquires an angular momentum quantum number corresponding to a quan- 
tized atomic orbital when binded to the atomic nucleus. The importance of 
the spin-orbit coupling (SOC) effect in semiconductors and other materials 
have been extensively explored in quantum physics and quantum chemistry. 
For example, SOC causes shifts in the atomic energy level of an electron [1], 
and leads to protected metallic surface or edge states as a consequence of 
the topology of the bulk electronic wave functions As a typical rela- 
tivists effect, SOC has magnitude of the order Z A a~ 2 for a hydrogen like 
atom where Z is the nuclear charge, a = c~ l in the atomic unit is the 
fine structure constant and c is the speed of light. For systems containing 
heavy elements with large Z, the nonrelativistic Schrodinger type equations 
such as the Kohn-Sham (KS) density functional theory 0, [|§] leads to large 
error. The extension of the density functional theory is not straightforward 
as quantum electrodynamics has to been used for charged particles in which 
complicated renormalization is necessary to get finite expressions for charge, 
energy, etc. Q. The relativistic density functional theory, first laid out by 
Rajagopal and Callaway 0, can be rigorously derived from quantum elec- 
trodynamics. However, the resulting equations are too complicated to solve 
in practice and proper renormalization has to be performed to eliminate 
the divergent terms. The frequently-used working equations are the four- 
component Dirac-Kohn-Sham (DKS) equations derived by Rajagopal jsf and 
independently by MacDonald and Vosko j9| after making several physically 
reasonable approximations. 

Mathematically, the DKS operator is fundamentally different from the KS 
operator. The spectrum of the DKS operator, sketched in Fig. [fl consists of 
the point spectrum (blue dots), the positive energy continuum (thick blue 
line from +mc 2 to +oo) and the negative energy continuu m ( thin red line 



from — oo to —mc 2 ), and is therefore unbounded from blow 10|. Meanwhile 
the spectrum of the KS operator does not contain any negative energy and 
is bounded from below. Here m is the electron mass and we have m = 1 
and c ~ 137 in atomic unit. When solving the DKS equations numerically, 
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Figure 1: A sketch of the spectrum of the DKS operator. For explanations 
see the text. 



the DKS operator will be discretized by a basis set of finite size, and both 
the positive energy continuum and the negative energy continuum will be 
truncated before infinity is reached. Hereafter we refer to the set of eigenval- 
ues of the discretized DKS operator in the point spectrum and the positive 
energy continuum as the positive energy band, and the set of eigenvalues of 
the discretized DKS operator in the negative energy continuum as the neg- 
ative energy band. There are approximately equal number of eigenvalues in 
the positive and the negative energy bands separated by a large gap, which 
is also called the forbidden region with its width denoted by A. We denote 
by w the maximum of the widths of the positive and the negative energy 
band. The desired eigenvalues and eigenvectors in the positive energy band 
are usually contained in the point spectrum or in very rare cases cover the 
entire point spectrum and a very few low lying states in the positive energy 
continuum. 

Attempts to solve the DKS equations are plagued by the so-called "vari- 



ational collapse" [111 ], which specifically means two difficulties. The first 
difficulty is the so-called spectral pollution - the appearance of spurious 
eigenvalues which are the limiting points of the eigenvalues of the discretized 
DKS operator as the basis set becomes complete, but are not the eigenvalues 
of the true DKS operator. They often lie deeper than the desired solutions 
or may even be degenerate with them. The second difficulty is the occur- 
rence of the spurious eigenvalues in the forbidden region when solving the 
discretized DKS operator using inappropriate numerical schemes. The reason 
for such collapse is that the spectrum of the DKS operator cannot be defined 



variationally [lOj. Several prescriptions to avoid variational collapse were 



summarized and analyzed by Kutzelnigg 12j as well as Lewin and Sere 13| 



Among all the prescriptions, the kinetic balance prescription [14|, [15|, [16 



addresses the first difficulty most successfully from practical point of view, 
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and serves as the foundation for most successful attempts to solve the DKS 
equations via finite dimensional matrix eigenvalue problems. Many effective 
numerical implementations for performing four-component DKS calculations 
with the localized basis sets (e.g. Gaussian type orbitals and atomic orbitals) 
have been presented by quantum chemists in the last four decades, includ- 
ing the DVM scheme 
and coworkers 
DIRAC 



17, 1 
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the BDF package 
the REL4D module in Utchem 
and the BERTHA code 
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the program of Fricke 
the DKS module in 
In all these methods, the finite 
dimensional DKS matrix is diagonalized directly as a dense matrix, which 
computes all the eigenvalues and eigenvectors in both positive and the neg- 
ative energy bands. The direct diagonalization methods avoid the second 
difficulty in the variational collapse originating from inappropriate numeri- 
cal schemes. However, these methods are prohibitively expensive when the 
dimension of the discretized DKS operator becomes large. 

While the localized basis sets are widely used in the community of quan- 
tum chemistry, the planewave basis set is more popular in the community of 
condensed matter physics for the reason that systematic convergence can be 
achieved by only increasing the kinetic energy cutoff. The planewave basis 
set can be used as the only basis set in the pseudopotential framework [24J, 
and as an augmented basis set such as in the linearized augmented planewave 
method (LAPW) j25|. In terms of the DKS density functional theory, the 
planewave basis set automatically satisfies the kinetic balance prescription 
and is free from the spectral pollution. However, compared to the localized 
basis set, the planewave basis set leads to matrix eigenvalue problems of much 
larger size which is impractical to be diagonalized directly as dense matrices. 
Iterative methods have to be designed to solve the matrix eigenvalue prob- 
lems for the desired eigenvalues and eigenvectors contained in the positive 
energy band. There are approximately the same number of eigenvalues in the 
positive and the negative energy bands, and standard iterative techniques for 
solving the KS systems cannot be efficiently applied without necessary mod- 



ification. For example, the conjugate gradient method [26|, |27j, the lo cally 



optimal block preconditioned conjugate gradient (LOBPCG) method 28l 129 



the RMM-DIIS method [30| and the Lanczos method [3l| need to evaluate 
all the states in the negative energy band and therefore is prohibitively ex- 
pensive; It is difficult to design a set of efficient filtering polynomials as in the 
Chebyshev filtering method [32|, since the separation between the occupied 



and unoccupied eigenvalues in the positive energy band (usually in the order 
of 0.1 au or smaller) is much smaller compared to the width of the forbidden 
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region (A 37000 au). The spectral radius of the matrix resulting from the 
spectrum folding technique 33] will be in the order of A 2 10 9 for the DKS 
equations, and the resulting positive definite eigenvalue problem cannot be 
solved efficiently with iterative methods. 

In this work, an efficient iterative method to directly solve the four- 
component DKS density functional theory will be presented for the first 
time. The key component of our method is a filtering step (F) which can 
act as a preconditioner in preconditioned iterative diagonalization methods. 
In particular, our method is demonstrated based on the LOBPCG method, 
and is therefore dubbed the LOBPCG-F method. Compared to other types 
of preconditioned conjugate gradient methods, the LOBPCG method has 
been shown to be effective for evaluating a relatively large number of eigen- 
values and eigenvectors, and its efficiency has been illustrated in large scale 



electronic structure calculation such as in ABINIT |34j . [35[. However, for 
the DKS systems, the standard LOBPCG method amplifies the error of the 
eigenfunctions projected to the negative energy band in each iteration, and 
therefore needs to evaluate all the eigenfunctions in the negative energy band 
together with the desired eigenfunctions in the positive energy band. We il- 
lustrate that such error can be efficiently controlled by the filtering step, and 
the desired eigenvalues and eigenvectors in the positive energy band can be 
evaluated without computing any negative energy state. The filtering step 
only introduces 2 extra matrix-vector multiplication per iteration, and can 
be implemented simply as a preconditioner with little coding effort. 

We implement the LOBPCG-F method to solve the DKS equations in 
the pseudopotential framework with a planewave basis set (module DKS). To 
benchmark our numerical results, we also implement the standard LOBPCG 
method for solving the KS density functional theory (module KS), and for 
solving the two-component KS density functional theory with SOC in the 
pseudopotential framework (module KSSO). We directly compare the to- 
tal energies obtained from the modules KS and KSSO with those obtained 
from the corresponding modules in a popular electronic structure software 



ABINIT [35(. We apply the modules KS, KSSO and DKS to solve systems 
including Pt2, Au2, TIF, and Bi 2 Se3. The differences in the total free en- 
ergy are less than 1 meV per atom compared to ABINIT for all systems 
under study. Despite that the relativistic correction is relatively small for 
valence-valence interaction, our numerical results for solving the DKS equa- 
tions indicate that the LOBPCG-F method is robust and efficient in studying 
the relativistic effect in systems containing heavy elements. 
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All the discussion in the rest of the manuscript will be presented in 
the pseudopotential framework in the context of orthogonal basis set (the 
planewave basis set is orthogonal). We remark that the LOBPCG-F method 
can be used for all-electron calculation and for non-orthogonal basis func- 
tions as well. This is the case for the LAPW method such as implemented 



in the WIEN2k software [36[, and also for the localized basis set given that 
the overlap matrix is not very ill-conditioned. 

The paper is organized as follows. We briefly review the four-component 
DKS density functional theory in Section [2] and discuss in detail the vari- 
ational collapse and the choice of basis set. We develop the LOBPCG-F 
algorithm in Section [3J The numerical results are presented with discussion 
in Section HI The paper is concluded in Section Throughout the paper, 
the atomic units (e — H — 1) will be used unless otherwise noted. 



2. DKS density functional theory 

We will now describe the main working equations - the DKS equations 
derived by Rajagopal jsj and independently by MacDonald and Vosko [if, 



and refer to Engel [6| and van Wiillen [37| for a elaborate and general discus 



sion on relativistic density functional theory which can be rigorously derived 
from quantum electrodynamics. Several related numerical issues are then 
discussed, such as the variational collapse and the choice of basis in solving 
the DKS matrix eigenvalue problems. 

2.1. DKS equations 



Under the no-pair and electrostatic approximations p, |37|, the total en- 
ergy of an interacting n-electron system corresponding to the Dirac-Coulomb 
Hamiltonian takes the form 

E\pM=T\pM+l Kxt(r)p(r)dr+i / P( T M T *) dr x dr 2 +£; xc [p, m], (1) 

J * J I r l — r 2| 

where p(r) is the electron density, m(r) is the spin density, and V ext (r) is 
the nuclear attractive potential. The first term T is the kinetic energy of the 
noninter acting reference system suggested by Kohn and Sham j^], the third 
term is the electrostatic energy, and E xc is the exchange-correlation functional 
containing both the difference between the true many body electron-electron 
repulsive energy and the electrostatic energy, and the difference between the 
true many body kinetic energy and the kinetic energy of the noninteracting 



6 



reference system. This factious system, which could be represented by a 
single Slater determinant in terms of one-electron spinors {^i} corresponding 
to the energy levels {ei}, has the same electron density as the interacting 
many body system. In consequence, we could use {ifji} to evaluate the kinetic 
energy and the densities 



mc 2 l2 
ccr • p 



OCC s 

T = ^^(r)W(r), h B = I 

i ^ 

OCC OCC s 

p( r ) = ^*( r ) ViW, m ( r ) = ( 



ccr • p 

—mc 2 I<: 



cr 

2 



2 



(2) 



^(r), (3) 



where /id is the Dirac operator corresponding to free particles, p = — iV is 
the momentum operator, I n and n are the n x n unit and null matrices, 
°" = (^x; ^z) is the vector of the Pauli spin matrices, and the each spinor 
ifji is a complex vector- valued function: M 3 — > C 4 , which are often rewritten 
as ipi = {4>i,Xi) T with (/>i,Xi being functions: M 3 — > C 2 . In what follows we 
will often refer to the two-spinor 0« (resp. Xi) as the large (resp. small) 
component of the four-spinor ipi. Here the summations are only restricted to 
the occupied states in the positive energy band. 

The Euler-Lagrange equation with respect to E[p, m] gives rise to the 
DKS equation 



mc 2 L 



ccr ■ p 



ccr ■ p 

-cr ■ B xc — mc 2 l2 



VhxcM + Kxtl4 



where 



Vhxc(r) 
B xc (r) 



r dri 



5p 



SE XC 
5m 



(4) 

(5) 
(6) 



Here for simplicity the exchange-correlation functional under local density 
approximation (LDA) [38[ [39[ is used. Our discussion below is not restricted 
to the LDA approximation. 

Ideally all-electron calculations should be performed to obtain accurate 
results, but with large computational cost in order to resolve the sharp gradi- 
ent of core orbitals and the oscillation of valance orbitals in the neighborhood 
of nuclei. A more practical way striking the balance between accuracy and 
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efficiency is to employ effective core potentials |40|, in which the core elec- 
trons are frozen and the valance-only problem is solved. The pseudopotential 
technique is one branch of effective core potential approaches. One should 
be careful in choosing pseudopotential for a relativistic Hamiltonian such as 
DKS, since most of the existing pseudopotentials are generated for the non- 
relativistic Hamiltonian. The mismatch between the pseudopotential and the 
relativistic Hamiltonian introduce possibly double counting of the relativis- 



tic effect [41(. We adopt here the widely known HGH pseudopotential [42 
which includes the scalar relativistic effect and the spin-orbit coupling ef- 
fect of the core electrons by construction, and can be specified by a very 
small number of parameters due to its dual-space Gaussian form. Rigorously 
speaking, the HGH pseudopotential may still suffer from double counting 
of the relativistic effect, since the parameters are optimized using the non- 
relativisitic equations. On the other hand, the numerical method developed 
in this paper does not depend on the specific choice of parameters of the 
HGH pseudopotential, and the numerical results can be readily improved 
when the HGH pseudopotential is reparametrized for the DKS calculation. 
The HGH pseudopotential consists of three parts 

V HGH (r, r') = \/ loc (r)5(r - r')I 4 + £ (V,(r, r')I 4 + A^ so (r, r')L' . S) , (7) 

i 

where the subscript / is the angular momentum number, Vj oc , VJ, and AVJ so 
represent in the order the local contribution, the nonlocal contribution and 
the SOC effect of the HGH pseudopotential, of which the formulas can be 



found in [42| and thus are skipped here to save space. S(r) is the Dirac delta 



function, L' is the angular momentum at position r', and S = \ ( 5" ^ 2 ] is 

the spin operator. The HGH pseudopotential Vhgh is an integral operator 
on the spinors. After denoting the first matrix operator of the LHS term of 
Eq. (TJJ by H , we finally arrive the DKS equation for valence electrons 

[H + VWI4] &(r) + J V HGH (r, r')^(r')dr' = ^(r)e 4 . (8) 

The nuclear attraction term V cxt in Eq. (TJJ is replaced here by the Vhgh 
term that describes the electrostatic interaction between the effective nuclei 
and valance electrons. The DKS equations @ or (jSJ) are usually solved with 
the self-consistent field (SCF) iteration as in the case of the nonrelativistic 
KS equations [43 
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2.2. Variational collapse and kinetic balance 

As we have mentioned in Section [TJ the most salient feature of the DKS 
operator is that its spectrum contains the negative energy continuum and is 
therefore unbounded from below. The desired eigenvalues and eigenvectors 
corresponding to the occupied bound states lie in the positive energy band 
(see Fig. [1]), i.e. we are seeking for highly excited states above all the nega- 
tive energy states. When solving the resulting matrix eigenvalue problems by 
expanding ipi in Eq. (JH]) with a finite basis set, such feature of the DKS oper- 
ator imposes a larg e obstacle, i.e. the variational collapse called by Schwarz 
and Wallmeier 111 ], in the sense that the eigenvalues of the discretized DKS 



operator may not systematically converge to the eigenvalues of the true DKS 
operator as the basis size increases. This variational collapse, is imputed by 
Kutzelnigg 12| to the wrong nonrelativistic limit of the matrix representa- 
tion of the Dirac operator and several prescriptions have been proposed to 
avoid it. The first strategy is to replace minimization procedures by min-max 



procedures [4J, |45j, |46j. The min-max methods find the minimum over the 
large component of the spinors and the maximum over the small component 
of the spinors for the DKS energy functional, and these methods are not used 
often practically in electronic structure calculation. The second strategy is 
the two-component relativistic theory which seeks for operators which are 
bounded from below and agrees with the DKS operator in the nonrelaticistic 
limit 



47 



be found in 48 



A very good discussion of approaches made in this direction can 
The third strategy is to carefully choose the basis set with 



which the discretized DKS operator is free of the spectral pollution. Among 



16 



48 



falling into the 
and it serves as 



all the strategies, the kinetic balance method [14j, [15 
third category, is widely used by theoretical chemists 
the basis of most successful attempts to solve matrix eigenvalue equations 
based on finite-dimensional representations of the DKS operator. The feature 
of the kinetic balance is the one-to-one correspondence of the basis sets that 
are used to expand the large and small components, achieved via a linear 
operator er • p. Specifically, the basis {fk, , k — 1, ■ • • , N} for expanding the 
small components are obtained directly from the basis {gk, k = 1, • ■ ■ , N} for 
expanding the large components, according to the so-called kinetic balance 
prescription 

1 



fk 



2mc 



<r ■ P9k, 



N. 



(9) 



The use of the kinetic balance avoids the worst aspects of variational collapse 
and could generate the desired eigenvalues and eigenvectors in the positive 



9 



energy band with good accuracy if a sufficiently flexible basis set is employed 



15l . Il6l . Il3l . |49| . The localized basis sets satisfying the kinetic balance pre- 
scription (Q are often used in quantum chemistry and all the eigenvalues 
and eigenvectors of the discretized DKS operator are obtained by the direct 
diagonalization method including the negative energy band that are not used 
in practice. 

2.3. Choice of basis set: Planewaves 

Compared to the standard localized basis sets used in the relativistic 
quantum chemistry community, the planewave basis set has the advantage 
that systematic convergence can be achieved by tuning one parameter - the 



kinetic energy cutoff |43|. Furthermore, the planewave basis set automati- 
cally satisfies the kinetic balance prescription fl9]), i.e. the space spanned by 
planewaves within a certain cutoff is invariant when applied by the cr • p op- 
erator, and therefore the planewave basis set is a good basis set for both the 
large component and the small component. Namely, the planewave basis set 
can avoid the spectral pollution in practice. However, the planewave basis 
set usually results in a much larger degrees of freedom per atom (in the order 



of 500 ~ 5000 or more for standard norm conserving pseudopotential [24 
with only valence electrons taken into consideration), which is much larger 
than the degrees of freedom per atom used by localized basis set. We remark 
that the degrees of freedom used by the finite difference method and the finite 
element method will also be much larger than the degrees of freedom used by 
localized basis set, but it is less obvious how the kinetic balance prescription 
will be enforced in these methods. 

Due to the large matrix size after the planewave discretization, the direct 
diagonalization method will not be applicable. The discretized Hamiltonian 
operator in generated by the planewave basis set, hereafter also referred 
to as the Hamiltonian matrix, is dense in both the Fourier space and the real 



space, which discourages the usage of the shift-invert Lanczos method [50 



the contour integral based spectrum slicing metho ds [5 if . or the Fermi oper 



ator expansion based low order scaling algorithms |52l. l53l. l54j . Furthermore, 



isJpL 

MM 



techniques that directly eliminate the negative energy band, such as exact 



two-component theory |48[] involves direct manipulation of the Hamiltonian 
matrix, which is again not possible to handle in the case of the planewave ba- 
sis set as a result of the large matrix size. Iterative methods have to be used 
to diagonalize the discretized Hamiltonian operator, which will be discussed 
in detail in Section [3J 
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3. An efficient iterative method for solving the DKS equations 

As mentioned in Section [TJ there are approximately the same number of 
eigenvalues and eigenvectors in the positive and the negative energy bands, 
separated by a large spectral gap with width A as shown in Fig [TJ Existing 
iterative techniques for solving the KS equations cannot be efficiently applied 
to solve the DKS equations without necessary modification. In this section 
we demonstrate that the negative energy band of the DKS operator can be 
efficiently eliminated using only 2 additional matrix-vector multiplications 
per iteration, based on the locally optimal block preconditioned conjugate 
gradient (LOBPCG) method 28|. The key component of the proposed new 



method is an additional filtering (F) step, and we therefore refer to our 
new method as the LOBPCG-F method. We remark that a similar filtering 
procedure can also be applied to other preconditioned iterative solvers to 
eliminate the negative energy band, such as the preconditioned Davidson 



type methods [55 



3.1. LOBPCG-F algorithm 

The LOBPCG-F algorithm is described in Alg. [TJ It can be easily found 
there that removing the new filtering steps underlined in Alg.[TJyields directly 



the standard LOBPCG algorithm as in [28(. The matrices if, T and f(H) 
used in Alg. [TJ can be defined in the "matrix- free" form, in the sense that they 
are defined according to subroutines which only computes Hx, Tx, f(H)x for 
any vector x without forming the matrix elements explicitly. C is a constant 
used as a shift in the filtering function, which is chosen from the negative 
energy continuum as shown in Fig. [TJ 

We first demonstrate how the error of the eigenfunctions projected to 
the negative energy band is propagated in the standard LOBPCG method. 
To simplify the analysis, we assume here that the preconditioner T to be 
an identity operator. The qualitative result remains to be valid if standard 
preconditioners in the electronic structure calculation are applied, such as 



the preconditioner proposed by Teter et al [27], |56|. Furthermore, we need a 
key assumption that 

A>w, (10) 

which is valid in the pseudopotential framework with the planewave basis set 
for A pd 37000 and w ~ 100 hold there. The foregoing assumption is also 
valid in the all-electron calculation using the LAPW basis set, or using the 
localized basis set given that the basis set remains well conditioned. However, 
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Algorithm 1 The LOBPCG-F algorithm for solving the DKS density func- 
tional theory. The underlined steps describe the new filtering step compared 
to the standard LOBPCG method. The filter is given in Eq. (fl~6l) . 

Input: Subroutine Hx (Tx) to multiply the Hamiltonian (precondi- 
tioning) operator to a vector, the shift constant C, the nor- 
malization constant f2, the number of eigenvalues n, the ini- 
tial guess of eigenvectors Xo £ R Afxn , the number of initial 
filtering steps Njjat, the maximum number of iterations N[t e r, 
the tolerance for convergence e. 

Output: A = diag(Ai, . . . , A n ) where {Aj}" =1 are the lowest n eigen- 
values in the positive energy band, and X € R Arxn are the 
associated eigenvectors. 

1: for k = 1, . . . , iVinit do 
2: Set X <- f(H)X . 
3: end for 

4: Solve V, A\ € W nxn from the generalized eigenvalue problem 
(XqHXq)V = (Xq Xq)VAi, with the diagonal elements of Ai sorted 
in a non-decreasing order. 

5: Set Xi <- X V. 

6: Compute the residual R -(— HX\ — X\A.\. 
7: Set P <— \\ to be an empty matrix. 
8: for k = 1, . . . , A^er do 

9: Solve the preconditioned residual Y from TY = R. 
10: Set Y <- f(H)Y. 

11: Construct a subspace Z = [X k , Y,P], Z € R Nxn z and n z > n. 

12: Solve V, Afc + i G K" zX " z from the generalized eigenvalue problem 

(Z T HZ)V = (Z 7 Z)V Afe + i, with the diagonal elements of A^ +1 sorted 

in a non-decreasing order. 
13: Set V <— the first n columns of V. 
14: Set Afc+i <— the upper-left n x n block of Afc+i. 
15: Set X k+l <- ZV. 

16: Compute the residual R HX^+i — Xk+\Ak+\- 
17: if the norm of each column of R is less than e then 
18: Exit the loop. 
19: end if 

20: Set R <— the columns of R with norm larger than e. 

21: Set V the columns of V corresponding to remaining columns of R. 

22: Set P <- [0,Y,P]V. 
23: end for 

24: return X 4— Xk+i, A «— Afc+i. 

12 



we remark that w can be artificially large when the basis set is overcomplete 
and then the assumption (TTU]) will not be valid anymore. 

The LOBPCG method computes the residue R = HX — XA once per 
iteration (line 16 in Alg. [TJ). Denote by x a given column of X, A the cor- 
responding eigenvalue in A, and r and y the corresponding column in the 
residue R and the preconditioned residue Y, respectively. We express r and 
x using the eigen-decomposition as 



r • 

3 3 



(ii) 



x = + E t 



Here ipf is the spinor corresponding to e[ in the positive energy band, and 
ipj the spinor corresponding to e~ in the negative energy band. We further 
assume that the error of the eigenfunctions projected to the negative energy 
band, characerized by max.,- \x~\, is initially very small but not yet vanishes. 
Since r = Hx — xX, the coefficients {rf}, are related to the coefficients 
{xf}, {xj} according to 



X)xf, fj = (ej-\)xj, (12) 



implying that the error in the residue ff (resp. f~) is amplified by ef — 
A (resp. ej — A) from xf (resp. xj). In order to quantify the relative 
amplification of the error projected to the negative energy band in the residue 
r compared to the amplification of the error projected to the positve energy 
band, we define the amplification factor as follows 



maxj 


IS 


- A| 


maxj 


4 


- A 



7lobpcg = max ^ -r. (13) 



Since A is a desired eigenvalue contained in the positive energy band, the 
assumption ( fXOT) implies further that 

|ej — A| ~ A, |e+-A|~w. (14) 

In consequence, we have 

7lobpcg ~ —7 (15) 
w 
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and the relative error projected to the negative energy band in r is amplified 
by a factor A/w compared to that in x. The error remains in the precondi- 
tioned residue y and propagates into the subspace used in the next LOBPCG 
iteration (line 11 in Alg. [TJ). In case of # = 300, even if the initial error is 
small, say maxj \x~\ ~ 10~ 15 , only after 6 iterations, the error of the eigen- 
functions projected to the negative energy band will accumulate to 0(1) in 
the worst case scenario. 

In order to compensate for the amplified error projected to the negative 
energy band, the LOBPCG-F method applies a filtering step to the precondi- 
tioned residue Y before the error propagates to the next LOBPCG iteration 
(line 10 in Alg. [TJ). The filtering function takes the form 

f(E) = ^ 2 (E-C)\ (16) 

where the shift constant C is chosen to be in the negative energy continuum 
(see Fig. [T]), and Q is a normalization constant so that f(E) = 1 when E 
is the largest eigenvalue in the positive energy band. We remark that al- 
though filtering functions of other forms can also be employed, the quadartic 
filtering function (fT6|) requires only 2 extra matrix-vector multiplications, 
and achieves nearly the minimum cost. Under the assumption ( JTOl) . we have 
Q ~ A. In the following we show how the filter ( fl6l) works. 

Again using the eigen-decomposition for y after the filtering step (line 10 
in Alg. [T|) and assuming the identity preconditioner T lead to 

» i 

where 

yt = f(4)(4~^ti y- = f(ej)(ej-\)xj. (18) 
The amplification factor of the LOBPCG-F method becomes 



max, /( e _. )( 6j 




maxi /(e+)(e+ 





7lobpcg-f = max , J J —p. (19) 



Again under the assumption (flOj) . we have 

\f(ej)(ej - A) | ~ |/(e+)(e+ - A)| ~ (20) 
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and then 

w 

TLOBPCG— F ~ (21) 

Therefore after each filtering step, the relative error projected to the negative 
energy band in y is reduced by a factor w/A compared to the relative error 
in x. As mentioned in Section [TJ the spectral radius of the matrix originated 
from the spectrum folding type methods [33| deteriorates as A 2 when A in- 
creases to infinity, and an efficient iterative method is difficult to be designed 
for the resulting matrix eigenvalue problem. On the contrary, the efficiency 
of the LOBPCG-F method improves as A increases, characterized by the 
factor w/A. 

We plot the shape of the filtering function (fl6|) for the case w = 100 au 
with a very small gap A = 300 au (correspondingly c ~ 12 au) in Fig. [2] (a) 
and for w = 100 au with a larger gap A = 3000 au (correspondingly c ~ 39 
au) in Fig. [5] (b). We observe that even for such relatively small gaps, the 
filtering function already quickly reduces the error in the negative energy 
band. The filtering function ffl6l) is increasingly more effective as the gap 
A increases. In the practical calculation, the gap A « 37000 au (c « 137 
au), and thus the filtering function can effectively control the error of the 
eigenfunctions projected to the negative energy band in each LOBPCG-F 
iteration. 

Finally we need to address the initial condition such that max,,- \xj\ is 
small. This can be achieved by the lines 1-3 in Alg. [1] using iV init steps of 
the same filtering function. A high quality input of the initial vectors X can 
help reduce the number N in[t . In the DKS calculation, a good set of initial 
vectors can be obtained by using the wavefunctions obtained from the KS or 
the KSSO calculations. The converged KS or KSSO wavefunctions can be 
used as the initial guess for the large component of the DKS spinors, and the 
initial guess for the small component of the DKS spinors can be obtained by 
applying the kinetic balance prescription to them. The good input of the 
initial vectors is also readily available in consecutive SCF steps, in which X 
simply takes the output X in the previous SCF step, and the error projected 
to the negative energy band in X is negligible as controlled by the LOBPCG- 
F method. Using such strategy, we find that the LOBPCG-F algorithm is 
robust by setting N init = 3 for the first SCF iteration (for safety), and then 
setting TVinit = 1 in the following SCF steps. 
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Figure 2: The filtering function f(E) for w = 100 au with a small gap 
A = 300 au (a) and with a large gap A = 3000 au (b). The function values 
correspond to the positive energy band (blue solid line) and the negative 
energy band (red circles) are separated by the function values in the spectral 
gap (black dotted line). 

3.2. Implementation 

Compared to the standard LOBPCG algorithm which applies the Hamil- 
tonian matrix once per iteration step, the LOBPCG-F algorithm applies the 
Hamiltonian matrix for 2 extra times per iteration due to the filtering step 
(116j) . The computational cost of the numerical linear algebra operations, such 
as for the orthogonalization step and for the projected Rayleigh-Ritz eigen- 
value problem, remains to be the same as that in the standard LOBPCG 
algorithm. Furthermore, the implementation of LOBPCG-F algorithm is 
very simple. Since the implementation of the LOBPCG algorithm always 
provides an interface for the preconditioner, one only needs to apply the fil- 
tering function after the standard preconditioner to eliminate the negative 
energy states components as shown in Alg. [TJ and the overall framework 



for LOBPCG, such as that provided by BLOPEX [29(] does not need to be 
changed. 

4. Numerical results 

We implement the LOBPCG-F method for solving the DKS system (jSj) 



using the HGH pseudopotential [42|, with the local and nonlocal pseudopo- 
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tential implemented fully in the real space [57J. The exchange- correlation 
functional under local density approximation (LDA) is used. Together with 
the module DKS for solving the four-component DKS system, we also imple- 
ment the standard LOBPCG method for solving the KS density functional 
theory (module KS), and for solving the two-component KS density func- 
tional theory with SOC in the pseudopotential framework (module KSSO). 
The major difference between the module KS and KSSO is that KSSO in- 
cludes the spin-orbit coupling term between the core electrons and the valence 
electrons in the form of Y2i Vi ( r > r')L/ ■ S as in Eq. (1T1) . The real and com- 
plex arithmetics are equally handled in our implementation. The description 
of the modules KS, KSSO and DKS is summarized in Table [TJ Since our 
main focus is the relativistic effect such as SOC, the module KS directly uses 
a two-component spinor rather than a one-component spinor. However, this 
does not lead to changes in the computed physical quantities such as the 
total energy. The availability of modules KS and KSSO allows us to directly 
benchmark our implementation with existing software such as ABINIT 35 
for electronic structure calculation. To facilitate the presentation of the nu- 
merical results, we adopt the standard convention that all energies are shifted 



by 



-mc 



so that the positive energy continuum starts from rather than 



+mc 2 . 



Table 1: The number of components for describing a spinor (iV S pi n0 i.), the 
arithmetic, and the diagonalization solver for modules KS, KSSO and DKS 
in our implementation. 



Module 


N ■ 

1 > spmor 


Arithmetic 


Solver 


Description 


KS 


2 


Real 


LOBPCG 


Kohn-Sham 


KSSO 


2 


Complex 


LOBPCG 


Kohn-Sham with 










spin-orbit coupling 


DKS 


4 


Complex 


LOBPCG-F 


Dirac-Kohn-Sham 



More computational details of our implementation and the numerical re- 



sults are as follows. The preconditioner proposed by Teter et al [271 . 156 



is em- 



ployed by both LOBPCG and LOBPCG-F methods. Anderson mixing [58 



with Kerker preconditioner [59| is used for the SCF iteration. Gamma point 
Brillouin sampling is used for simplicity for all calculations, even for crystal 
systems such as Bi2Se3. This is because the purpose of this manuscript is 
to demonstrate the capability of the LOBPCG-F method for solving DKS 
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systems. The support of fc-point sampling will be added in the future work. 
The density and the wavefunction are resolved using the same grid in both 
real space and Fourier space, and the grid size is measured by the corre- 
sponding kinetic energy cutoff in the Fourier space in atomic unit, denoted 
by E cut . All computational experiments are performed on the Hopper sys- 
tem at the National Energy Research Scientific Computing (NERSC) center. 
Each Hopper node consists of two twelve-core AMD "MagnyCours" 2.1-GHz 
processors and has 32 gigabytes (GB) DDR3 1333-MHz memory. Each core 
processor has 64 kilobytes (KB) LI cache and 512KB L2 cache. All modules 
KS, KSSO and DKS are implemented sequentially, and one core processor is 
used in each computation. 

4-1. Setup 




Figure 3: The converted orthorhombic supercell for describing a topological 
insulating system Bi 2 Se3 with 12 Bi atoms (large yellow balls) and 18 Se 
atoms (small red balls). 

We present numerical results for computing the total free energy of three 
dimers systems Pt2, Au2, and TIF, as well as a condensed matter system 
Bi 2 Se3. For simplicity of demonstration, the electronic structure calculation 
of the dimers are not computed with their optimal bond lengths. Instead, all 
the dimers are placed in a supercell of dimension 15.000 au, 10.000 au, 10.000 
au along the x, y, z directions respectively. The positions of the dimers are 
placed at (5.000, 0.000, 0.000) au and (10.000, 0.000, 0.000) au, respectively. 
In the HGH pseudopotential, the elements Pt, Au and Tl have the option of 
including semicore electrons [42| or not, and our implementation can handle 
both cases. Here we treat Pt atoms without semicore electrons, and treat Au 
atoms and Tl atoms with semicore electrons. Our implementation assumes 
an orthorhombic supercell. Non-orthorhombic cells can be converted into 
orthorhombic cells for the total energy computation. For example, the topo- 



logical insulator phase of Bi 2 Se3 has rhombohedral crystal structure [60|. The 
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Table 2: The kinetic energy cutoff E cutl the number of valance electrons and 
the treatment of semicore electrons for the systems under consideration. The 
same kinetic energy cutoff is used in both ABINIT and our implementation. 



System 




# Electron 


Semicore 


Pt 2 


50 


20 


No 


Au 2 


50 


22 


Yes 


TIF 


400 


20 


Yes 


Bi 2 Se 3 


45 


168 


No 



rhombohedral structure can be converted into an orthorhombic cell as shown 
in Fig. |3] with 12 Bi atoms and 18 Se atoms in the supercell. The lengths 
of the converted orthorhombic supercell is 7.820 au, 13.544 au, and 54.123 
au along the x, y, z directions, respectively. The kinetic energy cutoff E cut is 
set to be the same in ABINIT and in our implementation, and E cut is high 
enough so that the difference in the total free energy per atom is less than 1 
meV. Neither Bi nor Se allows semicore treatment in HGH pseudopotential. 
More details of the setup of the computational systems are illustrated in Ta- 
ble [2J Note that TIF requires particularly large kinetic energy cutoff. This 
is mainly due to localized pseudopotential of the F atom, which requires a 
very fine numerical grid to resolve. 

4.2. Calibration with ABINIT 



We compare our result with ABINIT [35( which also supports the usage 
of the HGH pseudopotential and the LDA approximation. ABINIT has the 
capability of performing KS calculation (by setting nspinor=l,nspden=l) and 
KSSO calculation with vector magnetization of the spin density (by setting 
nspinor=2,nspden=4). At present, ABINIT does not provide the DKS mod- 
ule. In all simulation results reported below, the temperature is set to be 
5000K only for the purpose of accelerating the convergence of SCF iteration. 



Finite temperature formulation of the KS density functional theory [61 



is 



used, and the total free energy is the quantity to be measured [62|. In par- 
ticular, the total free energy computed from modules KS, KSSO and DKS 
in our implementation are denoted by -Eks, -Eksso, -Edks, respectively, while 
those computed from the corresponding modules KS and KSSO in ABINIT 
are denoted by E^ miT , E^^q 1T , respectively. To facilitate the illustration, 
all the energies will be measured in the unit of au, and all the differences of 
energies will be measured in the unit of meV (lau ~ 27211meV ). 
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Table 3: The total free energies ^ks calculated from ABINIT, and E KS 
from module KS in our implementation for systems under consideration. 



System 




Eks 


(EkS — -E , KS 1N1T )/^ r atom 




au 


au 


meV 


Pt 2 


-52.364236 


-52.364274 


-0.51 


Au 2 


-66.438610 


-66.438601 


0.12 


TIF 


-74.156822 


-74.156828 


-0.08 


Bi 2 Se 3 


-237.357221 


-237.357063 


0.14 



In Table we compare the total free energies for the KS calculation. The 
differences of energies per atom are less than 1 meV in all cases. In Table HJ 
we compare the total free energies for the KSSO calculation. Similarly, the 
differences of energies per atom are also less than 1 meV in all cases. Tables [3] 
and S] indicate that the KS and KSSO modules in our implementation are 
accurate. Comparing the data shown in Tables |3] and HI we find that the 
total free energies are consistently lower when SOC effect is considered. 

Table 4: The total free energies ^ksso IT calculated from ABINIT, and -Eksso 
from module KSSO in our implementation for systems under consideration. 



System 


771AB1N1T 

-^KSSO 


-E'ksso 


(-^KSSO — -^KSSO iX )/^atom 




au 


au 


mev 


Pt 2 


-52.411523 


-52.411564 


-0.56 


Au 2 


-66.473300 


-66.473302 


-0.03 


TIF 


-74.178538 


-74.178602 


-0.87 


Bi 2 Se 3 


-237.728372 


-237.728987 


-0.56 



4.3. DKS results 

Table [5] demonstrates the total free energies computed using the module 
DKS in our implementation, for which ABINIT does not provide the same 
functionality, and shows the differences of the energies per atom between 
calculations using modules KS, KSSO and DKS. The energies obtained from 
DKS is systematically lower than those in KS and KSSO. The full relativis- 
tic correction described by DKS is large (in the order of hundreds of meV 
per atom) compared to the result obtained by KS which only takes into ac- 
count the scalar relativistic effect in the level of pseudopotential. Most of 
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the correction originates from SOC effect. The remaining correction due to 
the difference between the Dirac description (DKS) and the Pauli description 
(KSSO, with the relativistic effect taken account only in the HGH pseudopo- 
tential) is generally two orders of magnitude smaller than the SOC effect. 
Furthermore, since the relativistic effect of the core electrons is more sig- 
nificant than that of the valence electrons, the difference between DKS and 
KSSO is particularly small when the semicore treatment is not present, such 
as in the case of Pt2 and Bi2Se3. With the same HGH pseudopotential (J7|), 
the correction from the Dirac description compared to the Pauli description 
is found to be smaller than the transferability error of the pseudopotentials. 
Our study also agrees with the recent study using the planewave implemen- 
tation of the ZORA equations - one kind of approximate two-component 
relativistic theory by NWChem 
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We also note that in TIF, the difference 
between DKS and KSSO is around 10% of the effect due to SOC. To the 
extent of our knowledge, our result for the first time reveals the quantitative 
difference between the Pauli description and the full Dirac description of rel- 
ativistic effects for the valence electrons in the pseudopotential framework 
for the systems under study. 



Table 5: The total free energies -Ersso from module DKS in our implemen- 
tation for systems under consideration. 



System 


Edks 


(-Edks ~ Eks) / N^om 


(Edks ~ Eksso) / A^tom 




au 


meV 


meV 


Pt 2 


-52.411595 


-643.82 


-0.41 


Au 2 


-66.473702 


-477.56 


-5.44 


TIF 


-74.181219 


-331.85 


-35.60 


Bi 2 Se 3 


-237.730530 


-338.75 


-1.40 



4-4- Computational efficiency 

We demonstrate the computational efficiency of the LOBPCG-F method 
in terms of the wall clock time per SCF iteration for Pt2, Au2 and TIF 
in Table |6j Each SCF iteration consists of 3 LOBPCG iterations for the 
KS and the KSSO calculation, and 3 LOBPCG-F iterations for the DKS 
calculation. The computation of Bi 2 Se3 uses significant amount of virtual 
memory due to the large number of valance electrons in the system, and 
the corresponding computational time is therefore not meaningful and not 
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reported here. This will not be a problem when our parallel implementation 
of our method is introduced in the future. We also remark that compared to 
the standard LOBPCG method, the LOBPCG-F method does not introduce 
any additional memory cost. 

From KS to KSSO the computational time increased by a factor of 5 ~ 6. 
As discussed before, the module KS uses a two-component formulation, and 
the increase of the computational time is mostly due to the change from 
real to complex arithmetic. For computing the same quantity, the complex 
arithmetic is 4 times more expensive than the real arithmetic. The remaining 
difference comes from that KSSO uses complex to complex (c2c) Fourier 
transform, and KS uses real to complex (r2c) and complex to real (c2r) 
Fourier transform. 

From KSSO to DKS the computational time increases by a factor around 
4. In this process, the change of the number of components from 2 to 4 
inevitably increases the computational time by a factor of 2. The remaining 
factor of 2 in the increased computational time originates from the differ- 
ence between the LOBPCG and the LOBPCG-F method. Each LOBPCG 
iteration applies the Hamiltonian to all spinors once, and each LOBPCG-F 
iteration applies the Hamiltonian operator to all spinors for 3 times to fil- 
ter the components in the negative energy band. However, since the linear 
algebra operations (such as the orthogonalization procedure of the spinors) 
are the same in the LOBPCG and the LOBPCG-F method, the usage of 
LOBPCG-F only increases the computational time by a factor around 2 in 
practice. 

Table 6: The wall clock time (in the unit of sec) for performing one step of 
SCF iteration for systems under consideration. 



System 


KS 


KSSO 


DKS 


Pt 2 


4 


26 


113 


Au 2 


6 


37 


155 


TIF 


148 


709 


2787 



Fig. |4] compares the convergence of the SCF iteration for KS, KSSO and 
DKS using Au 2 as an example. The convergence of the SCF iteration is 
measured by the quantity ||Kut — Vm\\ / ||^in||> where V in is the local part of 
the effective potential before each SCF iteration, and V out is the effective 
potential after each SCF iteration. Fig. H] indicates that the usage of the 



22 



LOBPCG-F method does not deteriorate the convergence rate of the SCF 
iteration. Combining Table E] and Fig. HJ we conclude that the LOBPCG-F 
is a robust and efficient method for solving the DKS system in practice. 
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Figure 4: Convergence of the SCF iteration for KS (red dashed line), KSSO 
(blue line with circles) and DKS calculations (black solid line) for Au2- 



5. Conclusion and outlook 

In this manuscript we develop for the first time a simple and efficient 
iterative algorithm, named the LOBPCG-F method, for directly solving the 
Dirac-Kohn-Sham (DKS) equations in the relativistic density functional the- 
ory. By adding an additional filtering step in the preconditioning stage of the 
LOBPCG method, the LOBPCG-F method is able to compute the desired 
eigenvalues and eigenvectors in the positive energy band without computing 
any state in the negative energy band. The LOBPCG-F method requires only 
2 extra computational costs and little extra coding effort, and thus remark- 
ably facilitates the transition from nonrelativistic Kohn-Sham (KS) calcula- 
tions using LOBPCG methods to relativistic DKS calculations in studying 
the relativistic effect such as spin-orbit coupling, without the need of ap- 
proximating the DKS equation by two-component relativistic theory. We 
also remark that even though the LOBPCG-F method is efficient for solving 
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the DKS systems, it is still significantly more expensive to solve the more 
complicated four- component DKS systems compared to the KS systems or 
the KSSO systems. In practice one should decide whether two-component 
theory or the four- component theory should be pursued by balancing the 
desired efficiency and accuracy. 

The efficiency of the filtering step is determined by the factor w/A, where 
A is the spectral gap between the positive and the negative energy band, 
and w is the maximum of the widths of the positive and the negative energy 
band (see Fig.[T]). The smaller the factor is, the more efficiently the filter per- 
forms. It is worth highlighting that the proposed filtering technique is more 
general and can be embedded into other preconditioned iterative solvers, 
while the LOBPCG method is employed in this work for its high efficiency 
in the electronic structure calculation. We demonstrate the applicability of 
the LOBPCG-F method in the pseudopotential framework in the planewave 
basis set, in which the condition A 3> w is satisfied. The planewave basis set 
automatically satisfies the kinetic balance prescription and is free from the 
variational collapse. Our results compared with ABINIT indicate that our 
implementation is accurate, and the LOBPCG-F method does not lead to de- 
terioration in the convergence rate of the SCF iteration. We directly observe 
that the difference between the total energies between the two-component 
KS density functional theory with SOC description (module KSSO) and the 
DKS description (module DKS) is no more than a few tens of meV per atom 
for systems under study, and thus it is not cost-effective to solve the four- 
component DKS problem in the pseudopotential framework for most systems 
in practice. The LOBPCG-F method will be more effective for studying the 
relativistic effects in solving the all-electron DKS system directly when the 
condition A > w is satisfied, such as using the linearized augmented plane- 
wave basis set (LAPW) as in the WIEN2k software, or the localized basis 
set given that the basis set remains well conditioned. This will be our future 
work. 
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